Parkinson's disease (PD) is a chronic neurodegenerative disorder with multifactorial etiology.
In the past decade, the genetic causes of monogenic forms of familial PD have been defined.
However, the etiology and pathogenesis of the majority of sporadic PD cases that occur in out- 
bred populations have yet to be clarified. The recent development of resources such as the In- 
ternational HapMap Project and technological advances in high-throughput genotyping have 
provided a new basis for genetic association studies of common complex diseases, including
PD. A new generation of genome-wide association studies will soon offer a potentially powerful 
approach for mapping causal genes and will likely change treatment and alter our perception 
of the genetic determinants of PD. However, the execution and analysis of such studies will require
great care.
Journal of Movement Disorders 2010;3:1-5
In 2001, two reference versions of the human genome were published. 1,2 One human genome
sequence was reported by the Human Genome Sequencing Consortium and reflected
the assembly of sequences derived from numerous donors, 1 whereas the other genome sequence,
released by Celera Genomics, was a consensus sequence derived from five individuals. 2
However, both versions represented the human genome as a haploid sequence, and genetic
variation was not annotated. Therefore, many researchers have studied how genetic variants
contribute to phenotypic diversity and have conducted large-scale studies to identify and
catalogue nucleotides that differ among individuals. Initial studies focused largely on
understanding the range of patterns and frequencies of single nucleotide polymorphisms
(SNPs). 3-5 As their prevalence and contribution to human traits and biology were realized,
several consortia were formed, and systematic studies were performed to improve our
understanding of diverse human genomic variants. 6,7

The first complete human genome sequence of a single individual was published by Levy et al. 8
in 2007. Shortly thereafter, the second complete genome sequence of an individual, Watson,
determined with next-generation sequencing technology, was published. 9 Subsequently, three 
additional genomes from anonymous individuals were sequenced: one Han Chinese (Asian), 
one Nigerian (African), and one Korean (Asian). 10-12 Although these data have rapidly increased
our knowledge of the various forms of human genetic variation, our understanding of
the location and frequencies of structural variants across the genome is still limited. Howev- 
er, an enormous amount of effort is being expended to identify the common genetic variations 
that contribute to the development of common complex diseases. 

This review is a general overview of human genetic variation and its contribution to Parkin- 
son's disease (PD). 

Classes of Human Genetic Variation 

Common vs. rare variants 

Human genetic variants are typically referred to as either common or rare to denote the 
frequency of the minor allele in the human population. Common variants are synonymous with 
polymorphisms, defined as genetic variants with a minor allele frequency of at least 1% in the 
population, whereas rare variants have a minor allele frequency
of less than 1%.
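
The 1% cut-off can be illustrated with a short calculation. The following is a minimal sketch, assuming a biallelic variant and genotype counts from a hypothetical sample; the function names and the counts are illustrative and do not come from any study cited here.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return the minor allele frequency from biallelic genotype counts.

    n_aa, n_ab, n_bb are the numbers of individuals carrying genotypes
    AA, AB and BB (hypothetical labels for the two alleles).
    """
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_b = (2 * n_bb + n_ab) / total_alleles
    # The minor allele is whichever allele is less frequent.
    return min(freq_b, 1.0 - freq_b)


def classify_variant(maf, threshold=0.01):
    """Label a variant 'common' (MAF >= 1%) or 'rare' (MAF < 1%)."""
    return "common" if maf >= threshold else "rare"


# Hypothetical sample of 1,000 individuals.
maf = minor_allele_frequency(n_aa=810, n_ab=180, n_bb=10)
label = classify_variant(maf)
print(f"MAF = {maf:.3f} ({label})")
```

In practice, the frequency estimate depends on the population sampled, so a variant classified as common in one population may be rare in another.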

Single nucleotide polymorphisms 

A SNP is a single base change in the DNA sequence at a par- 
ticular point compared with the "common" or "wild-type" se- 
quence. SNPs are the most prevalent class of genetic variation 
among individuals. It has been estimated that the human genome
contains at least 11 million SNPs, with about 7 million
of these occurring with minor allele frequencies exceeding 
5% and the remaining having minor allele frequencies between 
1 and 5%. 

Structural variants 

Structural variants are defined as all base pairs that differ 
between individuals and that are not single nucleotide vari- 
ants. These include insertion-deletion variants (indels), block 
substitutions, inversions of DNA sequences, and copy number 
differences. The technical ability to detect structural variants 
in the human genome has only recently emerged. 6,13

Genetic Association Studies in Parkinson's Disease

Investigators conducting genetic association studies may 
target genes for investigation according to the known or pos- 
tulated biology and previous results, an approach known as 
candidate gene association. As a large-scale candidate gene as- 
sociation study, Chung et al. investigated the association of 
common variants in PARK loci and related genes with PD sus- 
ceptibility and age at onset in an outbred population (unpub- 
lished data: correspondence to Dr. Maraganore at NorthShore 
University Health System, Chicago, USA). They matched 1,103 
PD cases from the upper Midwest, USA, individually with unaffected
siblings (n = 654) or unrelated controls (n = 449) from
the same region. Using a sequencing approach in 25 cases and 
25 controls, SNPs in species-conserved regions of PARK loci
and related genes were detected. Additional tag SNPs were 
selected from the HapMap. A total of 235 SNPs and two vari- 
able-number tandem repeats (VNTRs) in the ATP13A2, DJ1, 
LRRK1, LRRK2, MAPT, Omi/HtrA2, PARK2, PINK1, SNCA, 
SNCB, SNCG, SPR, and UCHL1 genes were genotyped in 
all 2,206 subjects. Case-control analyses were performed to 
study the association with PD susceptibility, whereas case-only 
analyses were used to study the association with age at onset. 
Only MAPT SNP rs2435200 was associated with PD suscep- 
tibility after correcting for multiple testing [odds ratio (OR) = 
0.74, 95% confidence interval (CI) = 0.64-0.86, uncorrected p 
< 0.0001, log additive model]; however, 16 additional MAPT 
variants, seven SNCA variants, and one LRRK2, PARK2, and 
UCHL1 variant each had significant uncorrected p-values (Table 1).
No significant associations were found for age at onset after
correcting for multiple testing. These results confirmed
the association of the MAPT and SNCA genes with PD
susceptibility, but showed limited association of other PARK 
loci and related genes with PD. 
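
For readers less familiar with the effect measures quoted above, an odds ratio and its 95% confidence interval can be computed from a 2 x 2 table of allele counts. The sketch below uses the simple Woolf (log-based) approximation and hypothetical counts; it is not the log additive model used in the study described above, and the function name is illustrative.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf-type confidence interval.

    a: minor alleles in cases      b: major alleles in cases
    c: minor alleles in controls   d: major alleles in controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, chosen only for illustration.
or_, lo, hi = odds_ratio_ci(a=300, b=1700, c=400, d=1600)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An odds ratio below 1 with a confidence interval that excludes 1, as for the MAPT SNP above, indicates that the minor allele is associated with a lower risk of disease.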

Alternatively, we may screen the entire genome for associ- 
ation, an approach that has recently transformed the field of 
genetic association studies. Such a "genome-wide association
study (GWAS)" is hypothesis-free, as there is no bias or pre- 
sumptive list of candidate genes that are being tested. GWAS has 
greatly accelerated the pace of discovery of genetic association. 

As testing so many potential genes simultaneously carries 
the risk of finding many spurious associations, genetic variants 
that seem to have strong or suggestive statistical signals in an 
initial GWAS need to be tested for replication in other large 
data sets or studies. 

The boundaries between candidate gene studies and GWAS 
can become blurred, and the two types of study are not mutu- 
ally exclusive. 

Genome-Wide Association Study in Parkinson's Disease

Six GWAS of PD have been published (Table 2). 14-19 The study
by Maraganore et al. included 775 PD cases and 775 matched
controls. This study genotyped 198,345 informative genomic
SNPs, and found that a SNP within the semaphorin 5A
gene (SEMA5A) had the lowest combined p-value (p = 7.62
× 10⁻⁶). 14 The authors also reported some suggestive findings
for MAPT and SNCA, as well as other PARK loci and related 
genes. However, none of the findings was significant after 
correcting for multiple testing. The study by Fung et al. 15 ex- 
amined more SNP markers (408,000 SNPs), but also failed to 
observe an association of any genetic variation with PD sus- 
ceptibility after correcting for multiple testing; however, that 
study included only 276 PD cases and 276 unmatched con- 
trols. The study by Pankratz et al. 16 enrolled 857 familial PD 
cases and 867 controls, and observed suggestive associations 
for the GAK/DGKQ region on chromosome 4 (additive model:
OR = 1.69; p = 3.4 × 10⁻⁶), MAPT SNPs (recessive model:
OR = 0.56; p = 2.0 × 10⁻⁵), and the SNCA SNPs (additive model:
OR = 1.35; p = 5.5 × 10⁻⁵). Despite enriching their sample
for genetic load (familial PD cases), none of the SNPs was
significant after correcting for multiple testing. 
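
The effect of correcting for multiple testing can be made concrete with a Bonferroni calculation. The sketch below is illustrative only: the correction method actually applied in each study is not stated here, so the Bonferroni rule, the family-wise error rate of 0.05, and the function names are assumptions. The SNP count and p-value are those quoted above for the study by Maraganore et al.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of nominally significant tests if every null is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_bonferroni(p_value, n_tests, alpha=0.05):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(n_tests, alpha)


n_snps = 198_345   # SNPs genotyped by Maraganore et al.
best_p = 7.62e-6   # lowest combined p-value (SEMA5A SNP)

print(f"expected false positives at p < 0.05: {expected_false_positives(n_snps):.0f}")
print(f"Bonferroni threshold: {bonferroni_threshold(n_snps):.2e}")
print(f"best SNP significant after correction: {passes_bonferroni(best_p, n_snps)}")
```

Under these assumptions, a p-value in the range of 10⁻⁶ does not survive correction for roughly 200,000 tests, which is consistent with the negative results of the early, smaller studies. Bonferroni correction is conservative when SNPs are correlated through linkage disequilibrium.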

Recently, three GWAS confirmed that common variants in 
SNCA and MAPT genes increase PD susceptibility. 17-19 The
study by Satake et al. 17 (2,011 cases and 18,381 controls) reported
strong associations at SNCA on 4q22 (rs11931074,
OR = 1.37, p = 7.35 × 10⁻¹⁷), PARK16 on 1q32 (p = 1.52 × 10⁻¹²),
BST1 on 4p15 (p = 3.94 × 10⁻⁹), and LRRK2 on 12q12 (p = 2.72
× 10⁻⁸). The study by Simon-Sanchez et al. 18 (5,074 cases and
8,551 controls) observed two strong association signals in
the SNCA gene (rs2736990, OR = 1.23, p = 2.24 × 10⁻¹⁶) and
the MAPT locus (rs393152, OR = 0.77, p = 1.95 × 10⁻¹⁶). Note
that the two studies analyzed two distinct human populations
(Japanese and European), and data were exchanged so that 
each group could replicate the other's findings. The two GWAS
of PD reported consistent significant findings at three
loci (SNCA, LRRK2, and PARK16). The BST1 gene was associated
with PD only in the Japanese population, whereas multiple
variants within and near the MAPT gene were associated
with PD exclusively in subjects of European ancestry. The most
recent study by Edwards et al. 19 (1,752 cases and 1,745 controls)
observed that the SNCA SNP (rs2736990, OR = 1.29, p =
6.7 × 10⁻⁸) and the MAPT region (rs11012, OR = 0.70, p = 5.6
× 10⁻⁸) were genome-wide significant. Importantly, the SNCA
SNP rs2736990 is the same SNCA SNP that showed the second
highest nominally significant association with PD susceptibility
in the large-scale candidate association study of Chung
et al. The definitive evaluation of the functions of these genetic
variations awaits further investigation.

Limitations of Genome-Wide Association Study in Identifying Causative Variants

The GWAS approach still has substantial limitations. Enor- 
mous gaps remain in the ability to provide a biological expla- 
nation for why a genomic interval tracks with a complex trait. 
Although a tag SNP for a linkage disequilibrium (LD) bin is 
statistically associated with a trait, we have no idea of the pre- 
cise variants in the bin that have a causal role in contributing 
to variation in the trait. Moreover, tag SNPs are in LD not only 
with other SNPs, but also with common structural variants, 
the majority of which have not yet been identified. The caus- 
ative variants underlying GWAS test associations are likely 
to be regulatory rather than coding. Therefore, experiments 
should be conducted that simultaneously assay global gene 
expression and genome-wide variation in a large number of 
individuals to map genetic factors underlying differences in 
expression levels. These datasets may be valuable tools for 
identifying the causative variants and biological bases for 
many loci associated with a complex trait through GWAS. 

Implication of Genome-Wide Association Study Results for Other Populations

Unless a particular functional variant has been identified un- 
ambiguously, testing a tag SNP that is associated with a dis- 
ease or trait in one population for risk assessment in an indi- 
vidual from another population can be problematic. This pro- 
blem stems both from allele frequency differences between 
populations and from the fact that the LD pattern across loci 
that mark or co-segregate with a putative causally associated 
genetic variant may differ from population to population.
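
The dependence of a tag SNP on the local LD pattern can be illustrated by computing the squared correlation (r²) between a tag allele and a putative causal allele in two populations. The following is a minimal sketch: the haplotype and allele frequencies are hypothetical, and the function name is illustrative.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation (r^2) between alleles at two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (tag SNP)
          together with allele B (putative causal variant)
    p_a:  frequency of allele A
    p_b:  frequency of allele B
    """
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denominator


# Hypothetical population 1: the tag allele always travels with the causal allele.
r2_pop1 = ld_r_squared(p_ab=0.30, p_a=0.30, p_b=0.30)

# Hypothetical population 2: different allele frequencies and weaker LD.
r2_pop2 = ld_r_squared(p_ab=0.20, p_a=0.30, p_b=0.50)

print(f"population 1: r^2 = {r2_pop1:.2f}")
print(f"population 2: r^2 = {r2_pop2:.2f}")
```

With these illustrative numbers, the tag SNP is a near-perfect proxy for the causal variant in the first population but carries little information about it in the second, so a risk estimate based on the tag SNP would not transfer between them.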

Issues in Genome-Wide Association Study

We need to consider several issues to conduct GWAS prop- 
erly. Genotyping error, genotype proportions (Hardy-Wein- 
berg equilibrium), multiple comparisons, replication, popula- 
tion stratification, genetic risk prediction, and the manipulation 
and interpretation of information should be addressed adequately.
Publication bias (negative results tend not to be published)
is another major problem.
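
One of these checks, the test of genotype proportions against Hardy-Weinberg equilibrium, is simple enough to sketch. The example below uses a chi-square goodness-of-fit test with one degree of freedom on hypothetical genotype counts; the function name is illustrative, and exact tests are often preferred when genotype counts are small.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic SNP.

    Returns (chi2, p_value), where p_value is the upper tail of a
    chi-square distribution with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts with too few heterozygotes.
chi2, p_value = hwe_chi_square(n_aa=40, n_ab=20, n_bb=40)
print(f"chi2 = {chi2:.1f}, p = {p_value:.2e}")
```

A marked departure from Hardy-Weinberg proportions in controls, as in this hypothetical example, often points to genotyping error or population stratification rather than to a true biological effect.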

Future Directions 

Although the discovery of GWAS signals is exciting, the 
amount of work required to identify and confirm causal variants
should not be underestimated. However, we predict that
GWAS will identify common genetic risk variants for PD and
other common complex diseases. Future genomic technolo- 
gies, including whole genome sequencing and genome-wide 
measures of epigenetic variability and somatic variation, are 
likely to change the treatment strategy of PD and alter our per- 
ception of the genetic determination of the disease. Therefore, 
clinicians will need to have solid knowledge of genetic princi- 
ples and of the interpretation of complex genetic information. 